DETAILED ACTION
Response to Amendment 

1. In response to the Office Action mailed 12/24/03, Applicant has submitted an
Amendment, filed 5/20/04, amending the Specification to correct informalities and
amending claim 8 to overcome the Examiner's 35 U.S.C. 112 rejection.

While this led to withdrawal of the objections to the Specification and of the
35 U.S.C. 112, second paragraph, claim rejections, the 35 U.S.C. 102(e) claim
rejections remain, for the reasons given below in Response to Arguments.



Response to Arguments 

2. Applicant's arguments have been fully considered but they are not persuasive. 

Specifically, Applicant suggests that Ladd et al. fail to teach "at least selecting a 
rule from among a plurality of rules each specifying respective voice output contents 
and voice input candidates, and analyzing an obtained document based on the rule 
selected," as recited in claim 1 (Page 20). 

In the previous Office Action, Examiner listed several reasons why Ladd et
al.'s invention "reads on" the language of claim 1. For example, Ladd et al. teach parsing
the document based on the rules of the markup language (Col. 12, lines 18-20). As
shown in Figure 6, the markup language document (XML) contains sections (inside
<DIALOG> tags) that are the rules for interpreting the body of the document. It is
inherent that each XML document will have one or more DIALOG sections,
each covering a specific type of the machine-user dialog. This part of the XML
document structure "reads on" the "plurality of rules" language in claim 1.

Regarding the "rules each specifying respective voice output contents and voice 
input candidates," Figure 6 shows the <PROMPT> tags that provide "output contents" 
("What meal would you like to hear the specials for?") and the <OPTION> tags which 
specify "input candidates" (Lunch, Breakfast, Dinner). Note that <INPUT TYPE = 
OPTIONLIST> elements may contain direct instructions to "fetch" additional list 
components via SQL calls (Col. 41, lines 45-50). Because of these commands, the
software will inherently fetch the additional voice input or output contents/candidates 
(See the rest of the example code on Cols. 41-42). Finally, the interpreter unit parses 
each document based on the structure of the DIALOG sections (Col. 13, lines 52-59). 
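
For illustration only, the mapping described above can be sketched in Python. The sample document below is hypothetical: it uses the DIALOG, STEP, PROMPT, INPUT and OPTION element names mentioned in this action, but it does not reproduce Figure 6 of Ladd et al., and the NAME and NEXT attributes and the `extract_steps` helper are assumptions made for the sketch. It shows how output contents and input candidates could be read out of one DIALOG section.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names follow the tags discussed above,
# but the layout, the NAME/NEXT attributes, and the targets are assumptions.
SAMPLE = """
<DIALOG>
  <STEP NAME="init">
    <PROMPT>What meal would you like to hear the specials for?</PROMPT>
    <INPUT TYPE="OPTIONLIST">
      <OPTION NEXT="#lunch">Lunch</OPTION>
      <OPTION NEXT="#breakfast">Breakfast</OPTION>
      <OPTION NEXT="#dinner">Dinner</OPTION>
    </INPUT>
  </STEP>
</DIALOG>
"""


def extract_steps(xml_text):
    """Return, per STEP, the prompt texts and the (candidate, target) pairs."""
    root = ET.fromstring(xml_text)
    steps = []
    for step in root.iter("STEP"):
        # PROMPT text plays the role of "voice output contents".
        prompts = [p.text.strip() for p in step.iter("PROMPT") if p.text]
        # OPTION text plays the role of "voice input candidates"; the NEXT
        # attribute stands in for the next processing target.
        options = [
            (o.text.strip(), o.get("NEXT"))
            for o in step.iter("OPTION")
            if o.text
        ]
        steps.append(
            {"name": step.get("NAME"), "output": prompts, "candidates": options}
        )
    return steps


if __name__ == "__main__":
    for step in extract_steps(SAMPLE):
        print(step["name"], step["output"], step["candidates"])
```

The sketch covers only the static structure of a document; it does not model the SQL-driven fetching of additional list components or the interpreter's state transitions.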



Claim Rejections - 35 USC § 102 



3. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), 
by another filed in the United States before the invention by the applicant for patent or (2) a 
patent granted on an application for patent by another filed in the United States before the 
invention by the applicant for patent, except that an international application filed under the treaty 
defined in section 351(a) shall have the effects for purposes of this subsection of an application
filed in the United States only if the international application designated the United States and 
was published under Article 21(2) of such treaty in the English language. 

4. Claims 1-7, 9-15 are rejected under 35 U.S.C. 102(e) as being anticipated by
Ladd et al. (6,269,336 filed 10/2/1998). The table below summarizes the limitations of 
these claims and teachings in Ladd et al. that meet these limitations. 



Claim #


Limitations


Ladd et al.


1


A document processing apparatus comprising:


document obtaining means for obtaining a
document written in a predetermined markup
language from a designated source from which
the document is to be obtained


The network access apparatus of the
system allows the user to access (i.e., view
and/or hear) the information retrieved from
the information source. (Col. 3, lines 40-42).
The information can be stored in a
database of the information source and
can include text content, markup language
document or pages (Col 11, lines 42-45)


rule selecting means for selecting a rule defining
voice input/output contents from a plurality of
predetermined rules


The parser unit receives the information
from the network fetcher unit and parses
the information according to the syntax
rules of the markup language. (Column 12,
lines 18-20) The markup language can
include elements that describe the
structure of a document or page, provide
pronunciation of words and phrases, and
place markers in the text to control
interactive voice services. The markup
language also provides elements that
control phrasing, emphasis, pitch, speaking
rate, and other characteristics. (Column
16, 12-18 and FIG. 6) As seen from FIG. 6,
the <DIALOGUE> section contains both
input candidates and output contents,
which may also include instructions to fetch
additional elements via SQL calls. (Col. 41,
lines 45-50)


document analyzing means for analyzing a
designated range of the document obtained by
said document obtaining means based on the rule
selected by said rule selecting means to fetch
voice output contents, voice input candidates, and
designation information for designating a next
processing object corresponding to each voice
input candidate


The interpreter unit determines the next
state or step based upon the structure of
the dialog and the inputs from the user.
When the interpreter unit transitions to a
new dialog or page, the address of the new
dialog or page is then sent to the network
fetcher. (Column 13, lines 55-59)


voice outputting means for voice-outputting the
voice output contents fetched by said document
analyzing means


The TTS unit of the VRU server receives
textual data or information... The TTS unit
processes the textual data and converts
the data to voice data or information.
(Column 9, lines 3-10)


voice recognizing means for voice-recognizing
the voice input by the user


The ASR unit of the VRU server provides
speaker independent automatic speech
recognition of speech inputs or
communications from the user. (Column
9, lines 27-30)


controlling means for checking the result of
recognition by said voice recognizing means
against the input candidates fetched by said
document analyzing means to control obtainment
of a new document by said document obtaining
means or next analysis by said document
analyzing means based on designation
information corresponding to the input candidate
matching the recognition result.


The interpreter unit can transition from
state to state (i.e., step to step) within a
tree structure (i.e., a dialog) of a markup
language document or can transition to a
new tree structure within the same dialog
or another dialog. The interpreter unit
determines the next state or step based
upon the structure of the dialog and the
inputs from the user. When the interpreter
unit transitions to a new dialog or page, the
address of the new dialog or page is then
sent to the network fetcher. (Column 13,
lines 52-59)


2 


The document processing apparatus according 
to claim 1, wherein said rule selecting means 
selects a rule based on rule identification 
information described in the document obtained 
by said document obtaining means. 


The voice browser determines whether the
grammar for the user input is found in a
predetermined or pre-existing grammar
stored in a database or contained in the
markup language. (Column 14, lines 21-24)
See description of markup language at
Column 13, lines 52-59.


3 


The document processing apparatus according
to claim 2, wherein said rule identification
information is a predetermined attribute value of a
predetermined tag.


Markup language document includes tags
(Column 16, lines 29-31)


4 


The document processing apparatus according 
to claim 2, wherein said rule selecting means 
selects a predetermined rule if the rule 
identification information is not described in the 
obtained document. 


If a pre-existing grammar is not found at 
block, the voice browser dynamically 
generates the grammar for the user input. 
The voice browser looks up the 
pronunciations for the user in a dictionary. 
(Column 14, lines 29-33) 


5 


The document processing apparatus according 
to claim 1, wherein said document analyzing 
means fetches as said designation information a 
source from which a next document is obtained. 


When the interpreter unit transitions to a 
new dialog or page, the address of the new 
dialog or page is then sent to the network
fetcher. (Column 13, lines 55-59) The 
network fetcher unit retrieves information, 
including markup language documents, 
audio samples and grammars from the 
information sources. (Column 12, lines 
10-14) 


6 


The document processing apparatus according to
claim 1, wherein said document analyzing means
fetches an analyzed range of a next document as
said designation information.


The network fetcher unit retrieves
information, including markup language
documents (Column 12, lines 10-14).
Since network fetcher can retrieve full
documents, it can inherently retrieve
multiple documents specified in the
analyzed range of a next document.


7 


The document processing apparatus according 
to claim 1, wherein said rule selecting means 
selects a rule based on instructions from a user. 


The communication node can also allow
the user to select a particular speech
recognition model (Column 6, lines 25-36)
or choose models based on
<PROFILE> tag information (Col. 24, lines
12-65)


9 


The document processing apparatus according
to claim 1, wherein said plurality of rules includes
a rule which defines a predetermined attribute
value of a predetermined tag as voice output
contents, and contents surrounded by
predetermined second tags as input candidates,
in said document.


The PROMPT element of the markup
language is used to define content (i.e.,
text or an audio file) that is to be presented
to the user. (Column 18, lines 32-36).
The INPUT element of the markup
language is used to define a valid user
input within each STEP element. (Column
18, lines 56-58)


10 


The document processing apparatus according to 
claim 9, wherein in said rule, if said recognition 
result matches an input candidate, contents 
ranging from the contents surrounded by said 
second predetermined tags which correspond to 
the input candidate up to a third predetermined 
tag are defined as next voice output contents, and
an anchor in the voice output contents is defined 
as a next input candidate. 


See example (Column 16, line 63 - 
Column 17, line 15). The page consists of 
one rule (DIALOG) encompassing 
PROMPT elements that define voice output 
contents and INPUT elements that define 
input candidates. The nature of the markup 
language is such that these elements can 
be arranged in a variety of configurations 
that limit claim 11. 


11 


The document processing apparatus according to 
claim 1, wherein said plurality of rules includes a
rule which defines contents ranging from the head 
of said document to a predetermined tag as voice 
output contents, and an anchor in the voice output 
contents as an input candidate. 


See example (Column 16, line 63 - 
Column 17, line 15). The page consists of 
one rule (DIALOG) encompassing 
PROMPT elements that define voice output 
contents and INPUT elements that define 
input candidates. The nature of the markup 
language is such that these elements can 
be arranged in a variety of configurations
that limit claim 11. 



12


The document processing apparatus according to
claim 1, wherein said voice input and voice output
are performed through a telephone line.



The telecommunication network is 
preferably connected to the communication 
node via a high-speed data link, such as, a 
T1 telephone line. (Column 5, lines 39-42) 



13


A document processing method comprising:


a document obtaining step of obtaining a
document written in a predetermined markup
language from a designated source from which
the document is to be obtained


The network access apparatus of the
system allows the user to access (i.e., view
and/or hear) the information retrieved from
the information source. (Col. 3, lines 40-42).
The information can be stored in a
database of the information source and
can include text content, markup language
document or pages (Col 11, lines 42-45)


a rule selecting step of selecting a rule defining
voice input/output contents from a plurality of
predetermined rules


The parser unit receives the information
from the network fetcher unit and parses
the information according to the syntax
rules of the markup language. (Column 12,
lines 18-20) See definition of markup
language at Column 16, 12-18.


a document analyzing step of analyzing a
designated range of the document obtained in
said document obtaining step based on the rule
selected in said rule selecting step to fetch voice
output contents, voice input candidates, and
designation information for designating a next
processing object corresponding to each voice
input candidate


The interpreter unit carries out a dialog with
the user based upon the tree structure
representing a markup language
document. (Column 13, lines 45-47)
When the interpreter unit transitions to a
new dialog or page, the address of the new
dialog or page is then sent to the network
fetcher. (Column 13, lines 55-59)


a voice outputting step of voice-outputting the
voice output contents fetched in said document
analyzing step


The TTS unit of the VRU server receives
textual data or information... The TTS unit
processes the textual data and converts
the data to voice data or information.
(Column 9, lines 3-10)


a voice recognizing step of voice-recognizing the
voice input from the user


The ASR unit of the VRU server provides
speaker independent automatic speech
recognition of speech inputs or
communications from the user. (Column
9, lines 27-30)


and a controlling step of checking the result of
recognition by said voice recognizing step against
the input candidates fetched in said document
analyzing step to control obtainment of a new
document by said document obtaining step or
next analysis by said document analyzing step
based on designation information corresponding
to the input candidate matching the recognition
result.


The interpreter unit can transition from
state to state (i.e., step to step) within a
tree structure (i.e., a dialog) of a markup
language document or can transition to a
new tree structure within the same dialog
or another dialog. The interpreter unit
determines the next state or step based
upon the structure of the dialog and the
inputs from the user. When the interpreter
unit transitions to a new dialog or page, the
address of the new dialog or page is then
sent to the network fetcher. (Column 13,
lines 52-59).


14 


A computer-executable program for controlling a 
computer to perform document processing, said 
program comprising codes for causing the 
computer to perform: 

<Text same as in claim 13>


communication node can be carried out in 
the form of hardware components and 
circuit designs, software or computer 
programming, or a combination thereof. 
(Column 7, lines 14-17) 
The rest of this claim is rejected for the 
same reasons as claim 13. 


15 


A computer-readable storage medium for storing 
the program according to claim 


communication node can be carried out in 
the form of hardware components and 
circuit designs, software or computer 
programming, or a combination thereof.
(Column 7, lines 14-17) 



Claim Rejections - 35 USC § 103 

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

7. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over Ladd et 
al. 

Ladd et al. do not teach assigning priorities to rules and choosing rules based on 
their respective priorities. 

However, the examiner takes official notice that it is well-known in the art of
speech recognition to assign priorities to speech models (which are part of the rules 
specified by the XML document in Ladd et al.'s invention) in speech recognition systems 
in order to make the selection process of required speech models more flexible to the 
user's requirements. 

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time the invention was made to modify Ladd et al. to assign priorities and choose rules 
based on assigned priorities because this would enable the system to be more flexible 
to the user's requirements and choose a rule that would best fit the situation. 
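
As a purely illustrative sketch of the modification discussed above, choosing among rules by assigned priority can be expressed in a few lines of Python. Nothing here is taken from Ladd et al. or from the application: the `Rule` type, its fields, the `select_rule` helper, and the convention that a larger number means a higher priority are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Rule:
    # Hypothetical rule record; "priority" uses larger-is-preferred,
    # which is an assumption of this sketch.
    name: str
    priority: int


def select_rule(
    rules: Iterable[Rule], applicable: Callable[[Rule], bool]
) -> Optional[Rule]:
    """Return the highest-priority rule that applies, or None if none does."""
    candidates = [rule for rule in rules if applicable(rule)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: rule.priority)


if __name__ == "__main__":
    rules = [Rule("default", 0), Rule("menu", 5), Rule("headline", 3)]
    # Only rules other than "menu" are treated as applicable in this run.
    chosen = select_rule(rules, lambda rule: rule.name != "menu")
    print(chosen)
```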

Conclusion

7. Applicant's amendment necessitated the new ground(s) of rejection presented in
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37
CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

8. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Dmitry Brant whose telephone number is (703) 305-8954.
The examiner can normally be reached on Mon. - Fri. (8:30am - 5pm).

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Talivaldis Ivars Smits, can be reached on (703) 306-3011. The fax phone
number for the organization where this application or proceeding is assigned is (703)
872-9306.

Any inquiry of a general nature or relating to the status of this application or
proceeding should be directed to Tech Center 2600 receptionist whose telephone
number is (703) 305-4700.



DB 

7/14/04 







W. R. YOUNG 
PRIMARY EXAMINER 



